[None][feat] Add prefix-aware scheduling config flag to support opt-out #15526

Open
SimengLiu-nv wants to merge 3 commits into NVIDIA:main from SimengLiu-nv:prefix-aware-flag

Conversation

@SimengLiu-nv SimengLiu-nv commented Jun 22, 2026

Collaborator

Summary by CodeRabbit

  • New Features

    • Added enable_prefix_aware_scheduling configuration option to the scheduler, allowing fine-grained control over prefix-aware scheduling behavior for KV cache optimization (enabled by default).
  • Documentation

    • Updated guides with information on the new scheduler configuration option and its interaction with KV cache block reuse settings.

Description

Test Coverage

C++ focused gtests:
CapacitySchedulerTest.PrefixAwareSchedulingDisabledDoesNotDelayDuplicateRequest
CombinedSchedulerTest.PrefixAwareSchedulingDisabledKeepsReusableTokensZero
SerializeUtilsTest.SchedulerConfig

PR Checklist

Please review the following before submitting your PR:

  • PR description clearly explains what and why. If using CodeRabbit's summary, please make sure it makes sense.

  • PR Follows TRT-LLM CODING GUIDELINES to the best of your knowledge.

  • Test cases are provided for new code paths (see test instructions)

  • If PR introduces API changes, an appropriate PR label is added - either api-compatible or api-breaking. For api-breaking, include BREAKING in the PR title.

  • Any new dependencies have been scanned for license and vulnerabilities

  • CODEOWNERS updated if ownership changes

  • Documentation updated as needed

  • Update tava architecture diagram if there is a significant design change in PR.

  • The reviewers assigned automatically/manually are appropriate for the PR.

  • Please check this after reviewing the above items as appropriate for this PR.

GitHub Bot Help

To see a list of available CI bot commands, please comment /bot help.

@SimengLiu-nv SimengLiu-nv requested review from a team as code owners June 22, 2026 21:27
@SimengLiu-nv SimengLiu-nv changed the title from "Add prefix-aware scheduling config flag" to "[None][feat] Add prefix-aware scheduling config flag to support opt-out" Jun 22, 2026
@SimengLiu-nv

Collaborator Author

/bot run --disable-fail-fast

@tensorrt-cicd

Collaborator

PR_Github #55084 [ run ] triggered by Bot. Commit: 61321d5 Link to invocation

@coderabbitai

coderabbitai Bot commented Jun 22, 2026

Contributor

Review Change Stack

📝 Walkthrough

Introduces a new enable_prefix_aware_scheduling boolean flag (default true) in SchedulerConfig that controls whether schedulers use KV prefix-reuse estimates for admission and token-budget decisions. The flag propagates through C++ schedulers (MaxUtilizationScheduler, GuaranteedNoEvictScheduler, StaticBatchScheduler), Python schedulers (PyCapacityScheduler, KVCacheV2Scheduler), nanobind bindings, serialization, and the public Python API, with corresponding tests and documentation.
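
To make the flag's effect on token-budget decisions concrete, here is a minimal sketch in Python. It is a toy model under stated assumptions, not the TensorRT-LLM implementation: `PrefixReuseSummary` is a hypothetical stand-in with a single `reusable_tokens` field, and `charged_tokens` and `admit` are illustrative names that do not appear in the PR.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PrefixReuseSummary:
    """Hypothetical stand-in: tokens the KV cache could serve from reused blocks."""
    reusable_tokens: int = 0


def charged_tokens(prompt_len, summary, enable_prefix_aware_scheduling=True):
    """Tokens counted against the budget when admitting one context request."""
    if not enable_prefix_aware_scheduling:
        # Disabled: substitute a default (zero-reuse) summary.
        summary = PrefixReuseSummary()
    return max(prompt_len - summary.reusable_tokens, 0)


def admit(prompt_lens, summaries, token_budget, enable_prefix_aware_scheduling=True):
    """Greedily admit requests in order until the next one would exceed the budget."""
    admitted, used = [], 0
    for idx, (prompt_len, summary) in enumerate(zip(prompt_lens, summaries)):
        cost = charged_tokens(prompt_len, summary, enable_prefix_aware_scheduling)
        if used + cost > token_budget:
            break
        admitted.append(idx)
        used += cost
    return admitted
```

In this toy model, two 100-token prompts where the second can reuse 80 tokens both fit in a 150-token budget when the flag is on, while only the first is admitted when it is off.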

Changes

Prefix-Aware Scheduling Flag

| Layer | File(s) | Summary |
| --- | --- | --- |
| SchedulerConfig contract and serialization | cpp/include/tensorrt_llm/executor/executor.h, cpp/tensorrt_llm/executor/schedulerConfig.cpp, cpp/tensorrt_llm/executor/serialization.cpp, tensorrt_llm/llmapi/llm_args.py | SchedulerConfig gains the enablePrefixAwareScheduling constructor parameter, getter, equality comparison, and serialization round-trip; the Pydantic SchedulerConfig adds the field and passes it through _to_pybind. |
| C++ scheduler header declarations | cpp/include/tensorrt_llm/batch_manager/capacityScheduler.h | Constructor signatures for MaxUtilizationScheduler, GuaranteedNoEvictScheduler, StaticBatchScheduler, and CapacityScheduler gain enablePrefixAwareScheduling = true; the first two classes gain a mEnablePrefixAwareScheduling private member. |
| C++ scheduler behavior when flag is false | cpp/tensorrt_llm/batch_manager/capacityScheduler.cpp | skippingIsRelevant and analyzePrefixReuse calls are gated on mEnablePrefixAwareScheduling; when disabled, a default PrefixReuseSummary is substituted and chunked-context contributed-block tracking is skipped; CapacityScheduler threads the flag into all three policy constructors. |
| C++ executor wiring | cpp/tensorrt_llm/batch_manager/trtGptModelInflightBatching.cpp, cpp/tensorrt_llm/batch_manager/trtEncoderModel.cpp | Both executor models now read getEnablePrefixAwareScheduling() from the executor's scheduler config and pass it into the CapacityScheduler constructor. |
| Nanobind and Python binding extensions | cpp/tensorrt_llm/nanobind/batch_manager/algorithms.cpp, cpp/tensorrt_llm/nanobind/executor/executorConfig.cpp | CapacityScheduler nanobind binding gains the enable_prefix_aware_scheduling argument; SchedulerConfig binding gains the constructor parameter, backward-compatible pickle state (size-3 or size-4 tuple), and a new read-only property. |
| Python scheduler behavior (PyCapacityScheduler / BindCapacityScheduler / SimpleUnifiedScheduler) | tensorrt_llm/_torch/pyexecutor/scheduler/scheduler.py, tensorrt_llm/_torch/pyexecutor/_util.py | Adds _disabled_prefix_summary() helper; gates _is_skipping_relevant, _prefill_contributed_blocks, and _beneficial_to_skip on the flag; updates GuaranteedNoEvictPolicy and MaxUtilizationPolicy summary lookup with the disabled fallback; wires the flag through _util.py into all three scheduler construction paths. |
| KVCacheV2Scheduler pre/post-prepare_context budgeting | tensorrt_llm/_torch/pyexecutor/scheduler/scheduler_v2.py | KVCacheV2Scheduler stores the flag and conditionally checks token budget before prepare_context (disabled mode) or after (enabled mode) in both _try_schedule_context_full and _try_schedule_context_chunked; KV-resize always uses post-prepare_context length. |
| C++ unit tests | cpp/tests/unit_tests/batch_manager/capacitySchedulerTest.cpp, cpp/tests/unit_tests/batch_manager/microBatchSchedulerTest.cpp, cpp/tests/unit_tests/executor/serializeUtilsTest.cpp | New tests assert zero reusable tokens and no request staggering for duplicate requests under both scheduler policies when the flag is false; serialization round-trip coverage added. |
| Python tests and binding tests | tests/unittest/_torch/executor/test_kv_cache_v2_scheduler.py, tests/unittest/_torch/executor/test_py_scheduler.py, tests/unittest/bindings/test_executor_bindings.py, tests/unittest/llmapi/test_llm_args.py | New tests for KVCacheV2Scheduler pre-reuse budget charging, PyCapacityScheduler duplicate-request non-delay, and binding pickle/property preservation; boolean assertion style tightened to identity checks across existing test_llm_args.py tests. |
| Design doc, user-facing docs, golden manifest, and telemetry schema | reviews/designs/enable_prefix_aware_scheduling.md, docs/source/features/kvcache.md, docs/source/developer-guide/telemetry.md, tensorrt_llm/usage/llm_args_golden_manifest.json, tensorrt_llm/usage/schemas/README.md | Design document added; kvcache.md explains the flag's effect; telemetry tables and schemas updated; golden manifest extended with the new scheduler field plus unrelated KV-cache and sparse-attention fields. |

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Possibly related PRs

  • NVIDIA/TensorRT-LLM#14398: Directly shares the tensorrt_llm/usage/llm_args_golden_manifest.json and telemetry documentation pipeline that this PR also extends with the new scheduler_config.enable_prefix_aware_scheduling entry.

Suggested labels

SW Architecture

Suggested reviewers

  • arysef
  • syuoni
  • venkywonka
  • nv-guomingz
  • chang-l
  • karljang
🚥 Pre-merge checks | ✅ 3 | ❌ 2

❌ Failed checks (1 warning, 1 inconclusive)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 30.30% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |
| Description check | ❓ Inconclusive | The PR description provides test coverage details but lacks explanation of the issue, solution, and design rationale. The description template sections for 'Description' and 'Solution' are not filled in. | Add a 'Description' section explaining the problem/motivation and a 'Solution' section describing how the prefix-aware scheduling flag works and its impact. |

✅ Passed checks (3 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Title check | ✅ Passed | The PR title clearly identifies the main feature: adding a prefix-aware scheduling config flag with opt-out capability. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |

Comment @coderabbitai help to get the list of available commands.

@coderabbitai coderabbitai Bot left a comment

Contributor


Actionable comments posted: 3

🧹 Nitpick comments (5)
tests/unittest/llmapi/test_llm_args.py (1)

478-503: 📐 Maintainability & Code Quality | 🔵 Trivial

Coverage status: sufficient in tests/unittest/llmapi/test_llm_args.py.

test_SchedulerConfig_declaration now validates both default True and explicit False propagation to pybind, which is the key contract for this PR in this file. No follow-up needed here.

As per path instructions, tests/** reviews should state whether coverage is sufficient or needs follow-up.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tests/unittest/llmapi/test_llm_args.py` around lines 478 - 503, No changes
are needed. The test_SchedulerConfig_declaration function already provides
sufficient coverage by validating both the default True value for
enable_prefix_aware_scheduling and the explicit False value propagation to
pybind through the PybindMirror conversion, which fulfills the key contract
requirements for this PR.

Source: Path instructions

tests/unittest/bindings/test_executor_bindings.py (1)

2495-2502: 📐 Maintainability & Code Quality | 🔵 Trivial

Coverage status: sufficient in tests/unittest/bindings/test_executor_bindings.py (after the lint fix above).

The pickle and nested ExecutorConfig assertions adequately verify round-trip preservation of enable_prefix_aware_scheduling; no extra follow-up test is needed for this file.
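
The backward-compatible pickle state mentioned in the walkthrough (a size-3 or size-4 tuple) can be sketched in pure Python as follows. This is an illustration only: `SchedulerConfigSketch`, its field names, and the tuple ordering are assumptions, and the real binding is implemented in nanobind rather than with `__getstate__`/`__setstate__` on a Python class.

```python
import pickle


class SchedulerConfigSketch:
    """Hypothetical stand-in for the bound SchedulerConfig (field names are illustrative)."""

    def __init__(self, capacity_policy="GUARANTEED_NO_EVICT", chunking_policy=None,
                 dynamic_batch_config=None, enable_prefix_aware_scheduling=True):
        self.capacity_policy = capacity_policy
        self.chunking_policy = chunking_policy
        self.dynamic_batch_config = dynamic_batch_config
        self.enable_prefix_aware_scheduling = enable_prefix_aware_scheduling

    def __getstate__(self):
        # New pickles always carry four fields.
        return (self.capacity_policy, self.chunking_policy,
                self.dynamic_batch_config, self.enable_prefix_aware_scheduling)

    def __setstate__(self, state):
        if len(state) not in (3, 4):
            raise ValueError(f"unexpected state size: {len(state)}")
        self.capacity_policy, self.chunking_policy, self.dynamic_batch_config = state[:3]
        # Older size-3 states predate the flag, so fall back to the default (True).
        self.enable_prefix_aware_scheduling = state[3] if len(state) == 4 else True


def roundtrip(config):
    """Serialize and deserialize a config, as a pickle round-trip test would."""
    return pickle.loads(pickle.dumps(config))
```

A round-trip test in this style would check that an explicit `False` survives `roundtrip`, and that a legacy three-element state restores the default `True`.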

As per path instructions, tests/** reviews should state whether coverage is sufficient or needs follow-up.

Also applies to: 2644-2645, 2703-2703

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tests/unittest/bindings/test_executor_bindings.py` around lines 2495 - 2502,
This is a review approval confirming that the test_scheduler_config_pickle
function adequately verifies round-trip preservation of the
enable_prefix_aware_scheduling attribute through pickle
serialization/deserialization with the appropriate assertions on the
SchedulerConfig object. No code changes are required as the test coverage is
sufficient and meets the stated requirements.

Source: Path instructions

tests/unittest/_torch/executor/test_kv_cache_v2_scheduler.py (1)

175-211: 📐 Maintainability & Code Quality | 🔵 Trivial

Coverage status: sufficient in tests/unittest/_torch/executor/test_kv_cache_v2_scheduler.py.

The added cases cover constructor wiring and disabled-prefix-aware budget semantics with explicit assertions; no follow-up coverage is needed in this PR for this file.
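
The disabled-mode budget semantics can be pictured with a small sketch. It assumes, based only on the walkthrough summary, that the budget is checked against the full prompt length before `prepare_context` when the flag is off, and against the post-reuse length after it when the flag is on. The `prepare_context` function below is a hypothetical stand-in, and charging the full prompt length in disabled mode is an assumption.

```python
def prepare_context(prompt_len, reusable_tokens):
    """Hypothetical stand-in: returns the context length left after prefix reuse."""
    return max(prompt_len - reusable_tokens, 0)


def try_schedule_context(prompt_len, reusable_tokens, remaining_budget,
                         enable_prefix_aware_scheduling=True):
    """Return the tokens charged to the budget, or None if the request does not fit."""
    if not enable_prefix_aware_scheduling:
        # Disabled mode: check the budget against the full (pre-reuse) length first.
        if prompt_len > remaining_budget:
            return None
        prepare_context(prompt_len, reusable_tokens)
        return prompt_len
    # Enabled mode: resolve reuse first, then check the shorter post-reuse length.
    post_reuse_len = prepare_context(prompt_len, reusable_tokens)
    if post_reuse_len > remaining_budget:
        return None
    return post_reuse_len
```

With a 100-token prompt, 80 reusable tokens, and 50 tokens of remaining budget, this sketch schedules the request in enabled mode (charging 20) and rejects it in disabled mode.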

As per path instructions, tests/** reviews should state whether coverage is sufficient or needs follow-up.

Also applies to: 1987-2024

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tests/unittest/_torch/executor/test_kv_cache_v2_scheduler.py` around lines
175 - 211, The review comment is confirming that test coverage is sufficient and
properly documented per path instructions for test files. No code fix is
required - the comment is an approval noting that the test cases in
make_scheduler and related tests adequately cover constructor wiring and
disabled-prefix-aware budget semantics with explicit assertions, meeting the
requirement to document coverage status in tests/** files.

Source: Path instructions

tests/unittest/_torch/executor/test_py_scheduler.py (1)

2852-2880: 📐 Maintainability & Code Quality | 🔵 Trivial

Coverage status: sufficient in tests/unittest/_torch/executor/test_py_scheduler.py.

This addition directly validates the disabled path (including “must not call prefix-reuse analysis”) and is adequate for this PR scope.
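
A "must not call prefix-reuse analysis" check of this kind is commonly written with `unittest.mock`. The sketch below shows the general pattern against a toy scheduler; `TinyCapacityScheduler` and the single-argument `analyze_prefix_reuse` call are hypothetical and do not reproduce the actual test or the real method signature.

```python
from unittest.mock import MagicMock


class TinyCapacityScheduler:
    """Hypothetical scheduler: consults the KV cache manager only when the flag is on."""

    def __init__(self, kv_cache_manager, enable_prefix_aware_scheduling=True):
        self.kv_cache_manager = kv_cache_manager
        self.enable_prefix_aware_scheduling = enable_prefix_aware_scheduling

    def schedule(self, requests):
        scheduled = []
        for req in requests:
            if self.enable_prefix_aware_scheduling:
                # Stand-in call; the real signature may differ.
                self.kv_cache_manager.analyze_prefix_reuse(req)
            scheduled.append(req)
        return scheduled


def check_disabled_path_skips_analysis():
    manager = MagicMock()
    scheduler = TinyCapacityScheduler(manager, enable_prefix_aware_scheduling=False)
    duplicates = ["same prompt", "same prompt"]
    # Both duplicates are scheduled in the same step; nothing is delayed.
    assert scheduler.schedule(duplicates) == duplicates
    # The analysis hook must never have been touched on the disabled path.
    manager.analyze_prefix_reuse.assert_not_called()
```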

As per path instructions, tests/** reviews should state whether coverage is sufficient or needs follow-up.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tests/unittest/_torch/executor/test_py_scheduler.py` around lines 2852 -
2880, This review comment is a positive confirmation that the test coverage is
sufficient and no fixes are required. The test method
test_prefix_aware_scheduling_disabled_does_not_delay_duplicates adequately
validates the disabled path for prefix-aware scheduling by ensuring that the
analyze_prefix_reuse method is not called when enable_prefix_aware_scheduling is
False. No code changes are needed based on this review comment.

Source: Path instructions

cpp/tensorrt_llm/batch_manager/capacityScheduler.cpp (1)

321-335: 🚀 Performance & Scalability | 🔵 Trivial | ⚡ Quick win

Guard remaining prefix-reuse tree walk on disabled mode.

When prefix-aware scheduling is disabled, this block bypasses first-chunk reuse analysis, but the encoder-init cross-summary path still performs analyzePrefixReuse later even though skip logic is disabled in that mode. Add the same flag guard there to avoid unnecessary radix-tree walks.

♻️ Suggested diff
-                else if (isEncoderInit && crossKvCacheManager && crossKvCacheManager->isEnableBlockReuse()
+                else if (mEnablePrefixAwareScheduling && isEncoderInit && crossKvCacheManager
+                    && crossKvCacheManager->isEnableBlockReuse()
                     && !crossKvCacheManager->getBlockManager().isVariableWindow())
                 {
                     // Encoder admission only needs the cross summary for reuse ordering.
                     auto uniqueTokens = *(req->getEncoderUniqueTokens().value());
                     crossSummary = crossKvCacheManager->analyzePrefixReuse(uniqueTokens, *req);
                 }
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@cpp/tensorrt_llm/batch_manager/capacityScheduler.cpp` around lines 321 - 335,
The code correctly guards the main kvCacheManager's prefix reuse analysis with
mEnablePrefixAwareScheduling flag, but the subsequent block that handles the
cross-summary path with crossKvCacheManager still performs analyzePrefixReuse
even when scheduling is disabled. Add the same mEnablePrefixAwareScheduling
check along with the necessary block reuse and variable window guards before the
crossKvCacheManager analyzePrefixReuse call to match the pattern used for
kvCacheManager and prevent unnecessary radix-tree walks when prefix-aware
scheduling is disabled.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@cpp/tensorrt_llm/executor/schedulerConfig.cpp`:
- Around line 33-37: The SchedulerConfig::operator== method is incomplete
because it does not include mDynamicBatchConfig in its equality comparison.
Currently it only checks mCapacitySchedulerPolicy, mContextChunkingPolicy, and
mEnablePrefixAwareScheduling. Add a comparison of mDynamicBatchConfig to the
return statement using the logical AND operator to ensure that two
SchedulerConfig objects with different dynamic batch configurations will
correctly compare as not equal.
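
The effect of leaving one member out of an equality comparison can be shown with a short Python analogy. The classes and field names below are hypothetical mirrors of the C++ members named in the comment; `incomplete_eq` imitates the reported gap, while the dataclass-generated `__eq__` compares every field.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class DynamicBatchConfigSketch:
    enable_batch_size_tuning: bool = False


@dataclass(frozen=True)
class SchedulerConfigEq:
    capacity_scheduler_policy: str = "GUARANTEED_NO_EVICT"
    context_chunking_policy: Optional[str] = None
    dynamic_batch_config: Optional[DynamicBatchConfigSketch] = None
    enable_prefix_aware_scheduling: bool = True


def incomplete_eq(a, b):
    """Mirrors the reported gap: dynamic_batch_config is left out of the comparison."""
    return (a.capacity_scheduler_policy == b.capacity_scheduler_policy
            and a.context_chunking_policy == b.context_chunking_policy
            and a.enable_prefix_aware_scheduling == b.enable_prefix_aware_scheduling)
```

Two configs that differ only in `dynamic_batch_config` compare equal under `incomplete_eq` but unequal under the generated `==`, which is the behavior the comment asks for.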

In `@cpp/tests/unit_tests/batch_manager/microBatchSchedulerTest.cpp`:
- Around line 1413-1435: Add a precondition assert after the capacityScheduler
call (which produces scheduled1) to verify that req1 is present in scheduled1
before proceeding to the microBatchScheduler checks. This ensures the subsequent
assertion checking that req1InCtx is false is testing the correct behavior (that
microBatchScheduler filtered out req1 due to token budget constraints) rather
than passing vacuously if req1 was already absent from scheduled1. Use
ASSERT_TRUE or EXPECT_TRUE with std::any_of to confirm req1 exists in scheduled1
before the microBatchScheduler call.
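
The vacuous-pass risk described here is language-independent, so it is sketched below in Python rather than gtest. `micro_batch_filter` is a hypothetical second-stage filter and the request tuples are made up for illustration.

```python
def micro_batch_filter(scheduled, max_num_tokens):
    """Hypothetical second-stage filter: keep requests while they fit the token budget."""
    kept, used = [], 0
    for name, tokens in scheduled:
        if used + tokens <= max_num_tokens:
            kept.append((name, tokens))
            used += tokens
    return kept


def check_req1_filtered(scheduled1, max_num_tokens):
    names_in = [name for name, _ in scheduled1]
    # Precondition: without this, the final assertion would also pass when req1
    # never reached the second stage at all.
    assert "req1" in names_in, "req1 must be present before filtering"
    names_out = [name for name, _ in micro_batch_filter(scheduled1, max_num_tokens)]
    assert "req1" not in names_out
```

If the first stage ever stops scheduling `req1`, the precondition fails loudly instead of letting the test pass for the wrong reason.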

In `@tests/unittest/bindings/test_executor_bindings.py`:
- Around line 1409-1410: The assertions on the
config.dynamic_batch_config.enable_batch_size_tuning and
config.dynamic_batch_config.enable_max_num_tokens_tuning properties are using
explicit `== True` comparisons, which violates the E712 linting rule. Fix this
by removing the `== True` part from both assertions and relying on the
truthiness check of the boolean properties directly. This means simplifying each
assertion to just check the boolean property itself without the explicit
comparison operator.
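
For reference, the three assertion styles behave differently on truthy non-boolean values. The helper names below are illustrative only.

```python
def passes_equality(value):
    """The style E712 flags: also true for 1 and 1.0."""
    return value == True  # noqa: E712 (shown only for contrast)


def passes_truthiness(value):
    """The style suggested in the comment: true for any truthy value."""
    return bool(value)


def passes_identity(value):
    """The strictest style: true only for the bool singleton True."""
    return value is True
```

A property that accidentally returned `1` instead of `True` would satisfy the first two checks and fail only the identity check.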

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 8c226c76-7bfb-4363-98cf-e6d650c30a47

📥 Commits

Reviewing files that changed from the base of the PR and between f536026 and 61321d5.

📒 Files selected for processing (25)
  • cpp/include/tensorrt_llm/batch_manager/capacityScheduler.h
  • cpp/include/tensorrt_llm/executor/executor.h
  • cpp/tensorrt_llm/batch_manager/capacityScheduler.cpp
  • cpp/tensorrt_llm/batch_manager/trtEncoderModel.cpp
  • cpp/tensorrt_llm/batch_manager/trtGptModelInflightBatching.cpp
  • cpp/tensorrt_llm/executor/schedulerConfig.cpp
  • cpp/tensorrt_llm/executor/serialization.cpp
  • cpp/tensorrt_llm/nanobind/batch_manager/algorithms.cpp
  • cpp/tensorrt_llm/nanobind/executor/executorConfig.cpp
  • cpp/tests/unit_tests/batch_manager/capacitySchedulerTest.cpp
  • cpp/tests/unit_tests/batch_manager/microBatchSchedulerTest.cpp
  • cpp/tests/unit_tests/executor/serializeUtilsTest.cpp
  • docs/source/developer-guide/telemetry.md
  • docs/source/features/kvcache.md
  • reviews/designs/enable_prefix_aware_scheduling.md
  • tensorrt_llm/_torch/pyexecutor/_util.py
  • tensorrt_llm/_torch/pyexecutor/scheduler/scheduler.py
  • tensorrt_llm/_torch/pyexecutor/scheduler/scheduler_v2.py
  • tensorrt_llm/llmapi/llm_args.py
  • tensorrt_llm/usage/llm_args_golden_manifest.json
  • tensorrt_llm/usage/schemas/README.md
  • tests/unittest/_torch/executor/test_kv_cache_v2_scheduler.py
  • tests/unittest/_torch/executor/test_py_scheduler.py
  • tests/unittest/bindings/test_executor_bindings.py
  • tests/unittest/llmapi/test_llm_args.py

Comment thread cpp/tensorrt_llm/executor/schedulerConfig.cpp
Comment thread cpp/tests/unit_tests/batch_manager/microBatchSchedulerTest.cpp
Comment thread tests/unittest/bindings/test_executor_bindings.py
@tensorrt-cicd

Collaborator

PR_Github #55084 [ run ] completed with state SUCCESS. Commit: 61321d5
/LLM/main/L0_MergeRequest_PR pipeline #44070 completed with status: 'FAILURE'

CI Report

⚠️ Action Required:

  • Please check the failed tests and fix your PR
  • If you cannot view the failures, ask the CI triggerer to share details
  • Once fixed, request an NVIDIA team member to trigger CI again

Link to invocation

@SimengLiu-nv

Collaborator Author

/bot run --disable-fail-fast

@tensorrt-cicd

Collaborator

PR_Github #55324 [ run ] triggered by Bot. Commit: c34ff84 Link to invocation

Comment thread tests/unittest/bindings/test_executor_bindings.py Outdated
@tensorrt-cicd

Collaborator

PR_Github #55324 [ run ] completed with state FAILURE. Commit: c34ff84
/LLM/main/L0_MergeRequest_PR pipeline #44275 completed with status: 'FAILURE'

CI Report

⚠️ Action Required:

  • Please check the failed tests and fix your PR
  • If you cannot view the failures, ask the CI triggerer to share details
  • Once fixed, request an NVIDIA team member to trigger CI again

CI Agent Failure Analysis

Link to invocation

Comment thread cpp/tensorrt_llm/nanobind/executor/executorConfig.cpp Outdated
Comment thread cpp/tensorrt_llm/batch_manager/capacityScheduler.cpp Outdated
Comment thread cpp/tensorrt_llm/executor/schedulerConfig.cpp Outdated
@joyang-nv joyang-nv requested a review from tongyuantongyu June 25, 2026 01:39

@tongyuantongyu tongyuantongyu left a comment

Member

LGTM.

Nit: the name enable_prefix_aware_scheduling feels a bit redundant. prefix_aware is probably enough here.

@chang-l chang-l left a comment

Collaborator

Approval for doc changes.

@SimengLiu-nv

Collaborator Author

/bot run

@tensorrt-cicd

Collaborator

PR_Github #56407 [ run ] triggered by Bot. Commit: 9ac53b3 Link to invocation

@tensorrt-cicd

Collaborator

PR_Github #56407 [ run ] completed with state SUCCESS. Commit: 9ac53b3
/LLM/main/L0_MergeRequest_PR pipeline #45249 completed with status: 'FAILURE'

CI Report

⚠️ Action Required:

  • Please check the failed tests and fix your PR
  • If you cannot view the failures, ask the CI triggerer to share details
  • Once fixed, request an NVIDIA team member to trigger CI again

CI Agent Failure Analysis

Link to invocation

Signed-off-by: Simeng Liu <simengl@nvidia.com>
Signed-off-by: Simeng Liu <simengl@nvidia.com>
Signed-off-by: Simeng Liu <simengl@nvidia.com>
@SimengLiu-nv

Collaborator Author

/bot run --disable-fail-fast

@tensorrt-cicd

Collaborator

PR_Github #56451 [ run ] triggered by Bot. Commit: bcd57ce Link to invocation

@tensorrt-cicd

Collaborator

PR_Github #56451 [ run ] completed with state FAILURE. Commit: bcd57ce
/LLM/main/L0_MergeRequest_PR pipeline #45293 completed with status: 'FAILURE'

CI Report

⚠️ Action Required:

  • Please check the failed tests and fix your PR
  • If you cannot view the failures, ask the CI triggerer to share details
  • Once fixed, request an NVIDIA team member to trigger CI again

CI Agent Failure Analysis

Link to invocation

8 participants